AITopics | near-optimal sample complexity

Collaborating Authors

near-optimal sample complexity

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

0cc6ee01c82fc49c28706e0918f57e2d-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-7-2026, 11:13:18 GMT

final version, near-optimal sample complexity, planning oracle, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.62)

Add feedback

Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

Neural Information Processing SystemsDec-23-2025, 18:11:52 GMT

Model-based reinforcement learning (RL), which finds an optimal policy using an empirical model, has long been recognized as one of the cornerstones of RL. It is especially suitable for multi-agent RL (MARL), as it naturally decouples the learning and the planning phases, and avoids the non-stationarity problem when all agents are improving their policies simultaneously using samples. Though intuitive and widely-used, the sample complexity of model-based MARL algorithms has been investigated relatively much less often. In this paper, we aim to address the fundamental open question about the sample complexity of model-based MARL. We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative model of state transition. We show that model-based MARL achieves a sample complexity of $\tilde \cO(|\cS||\cA||\cB|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) \emph{value} up to some $\epsilon$ error, and the $\epsilon$-NE \emph{policies}, where $\gamma$ is the discount factor, and $\cS,\cA,\cB$ denote the state space, and the action spaces for the two agents. We also show that this method is near-minimax optimal with a tight dependence on $1-\gamma$ and $|\cS|$ by providing a lower bound of $\Omega(|\cS|(|\cA|+|\cB|)(1-\gamma)^{-3}\epsilon^{-2})$. Our results justify the efficiency of this simple model-based approach in the multi-agent RL setting.

model-based multi-agent rl, near-optimal sample complexity, zero-sum markov game, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.84)

Add feedback

We will improve the broader impact section by emphasizing the implications of our theoretical

Neural Information Processing SystemsOct-9-2025, 13:12:53 GMT

We sincerely thank all the reviewers, and feel really honored to receive such positive and constructive comments. We will mention total variation distance in the appendix, and correct the typo on "Corollary Note that the smooth planning oracle is not needed throughout the paper, and is thus not the "primary It is only used in Sec. We have discussed R-MAX in lines 82-83. By saying "especially model-free ones..." this sentence, we simply meant The works on Q-learning in games you mentioned exactly conquered this issue, with non-trivial efforts. We will address all the grammatical comments/typos in the final version.

artificial intelligence, broader impact section, final version, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.62)

Add feedback

Near-Optimal Sample Complexity for Iterated CVaR Reinforcement Learning with a Generative Model

Deng, Zilong, Khan, Simon, Zou, Shaofeng

arXiv.org Artificial IntelligenceMar-20-2025

In this work, we study the sample complexity problem of risk-sensitive Reinforcement Learning (RL) with a generative model, where we aim to maximize the Conditional Value at Risk (CVaR) with risk tolerance level $\tau$ at each step, named Iterated CVaR. We first build a connection between Iterated CVaR RL with $(s, a)$-rectangular distributional robust RL with the specific uncertainty set for CVaR. We develop nearly matching upper and lower bounds on the sample complexity for this problem. Specifically, we first prove that a value iteration-based algorithm, ICVaR-VI, achieves an $\epsilon$-optimal policy with at most $\overset{\sim}{O}\left(\frac{SA}{(1-\gamma)^4\tau^2\epsilon^2}\right)$ samples, where $\gamma$ is the discount factor, and $S, A$ are the sizes of the state and action spaces. Furthermore, if $\tau \geq \gamma$, then the sample complexity can be further improved to $\overset{\sim}{O}\left( \frac{SA}{(1-\gamma)^3\epsilon^2} \right)$. We further show a minimax lower bound of $\overset{\sim}{O} \left(\frac{(1-\gamma \tau)SA}{(1-\gamma)^4\tau\epsilon^2}\right)$. For a constant risk level $0<\tau\leq 1$, our upper and lower bounds match with each other, demonstrating the tightness and optimality of our analyses.We also investigate a limiting case with a small risk level $\tau$, called Worst-Path RL, where the objective is to maximize the minimum possible cumulative reward. We develop matching upper and lower bounds of $\overset{\sim}{O}\left(\frac{SA}{p_{\min}}\right)$, where $p_{\min}$ denotes the minimum non-zero reaching probability of the transition kernel.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

arXiv.org Artificial Intelligence

2503.08934

Country:

North America > United States > Arizona (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Thailand (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Near-Optimal Sample Complexity in Reward-Free Kernel-Based Reinforcement Learning

Kayal, Aya, Vakili, Sattar, Toni, Laura, Bernacchia, Alberto

arXiv.org Artificial IntelligenceFeb-11-2025

Reinforcement Learning (RL) problems are being considered under increasingly more complex structures. While tabular and linear models have been thoroughly explored, the analytical study of RL under nonlinear function approximation, especially kernel-based models, has recently gained traction for their strong representational capacity and theoretical tractability. In this context, we examine the question of statistical efficiency in kernel-based RL within the reward-free RL framework, specifically asking: how many samples are required to design a near-optimal policy? Existing work addresses this question under restrictive assumptions about the class of kernel functions. We first explore this question by assuming a generative model, then relax this assumption at the cost of increasing the sample complexity by a factor of H, the length of the episode. We tackle this fundamental problem using a broad class of kernels and a simpler algorithm compared to prior work. Our approach derives new confidence intervals for kernel ridge regression, specific to our RL setting, which may be of broader applicability. We further validate our theoretical findings through simulations.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2502.07715

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Thailand (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Review for NeurIPS paper: Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

Neural Information Processing SystemsJan-21-2025, 12:06:33 GMT

Additional Feedback: NOTE (post rebuttal): I didn't change my review, as a matter of expediency. But, I appreciate the authors' acknowledgement of my comments and support the plan for addressing them. I guess a brief clarification wouldn't hurt, but I wouldn't suggest using space to delve into the setting more deeply.) "corner stones" - "cornerstones" "the sample complexity of model based MARL algorithms has rarely been investigated": The Rmax paper was one of the first papers to study RL sample complexity AND dealt with MARL. Maybe it hasn't been "recently" investigated?

model-based multi-agent rl, near-optimal sample complexity, zero-sum markov game, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.40)

Add feedback

Review for NeurIPS paper: Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

Neural Information Processing SystemsJan-21-2025, 12:06:26 GMT

All reviewers agree that this is a solid work that deserves publication. We encourage the authors to follow the reviewers' suggestions when preparing the camera-ready version of their paper.

model-based multi-agent rl, near-optimal sample complexity, zero-sum markov game, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.40)

Add feedback

Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

Neural Information Processing SystemsOct-9-2024, 12:21:54 GMT

artificial intelligence, model-based multi-agent rl, near-optimal sample complexity, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

Faster Adaptive Decentralized Learning Algorithms

Huang, Feihu, Zhao, Jianyu

arXiv.org Artificial IntelligenceAug-19-2024

Decentralized learning recently has received increasing attention in machine learning due to its advantages in implementation simplicity and system robustness, data privacy. Meanwhile, the adaptive gradient methods show superior performances in many machine learning tasks such as training neural networks. Although some works focus on studying decentralized optimization algorithms with adaptive learning rates, these adaptive decentralized algorithms still suffer from high sample complexity. To fill these gaps, we propose a class of faster adaptive decentralized algorithms (i.e., AdaMDOS and AdaMDOF) for distributed nonconvex stochastic and finite-sum optimization, respectively. Moreover, we provide a solid convergence analysis framework for our methods. In particular, we prove that our AdaMDOS obtains a near-optimal sample complexity of $\tilde{O}(\epsilon^{-3})$ for finding an $\epsilon$-stationary solution of nonconvex stochastic optimization. Meanwhile, our AdaMDOF obtains a near-optimal sample complexity of $O(\sqrt{n}\epsilon^{-2})$ for finding an $\epsilon$-stationary solution of nonconvex finite-sum optimization, where $n$ denotes the sample size. To the best of our knowledge, our AdaMDOF algorithm is the first adaptive decentralized algorithm for nonconvex finite-sum optimization. Some experimental results demonstrate efficiency of our algorithms.

algorithm, inequality, optimization, (12 more...)

arXiv.org Artificial Intelligence

2408.09775

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Add feedback